
    Differential and coherent processing patterns from small RNAs

    Post-transcriptional processing events related to short RNAs are often reflected in their read profile patterns emerging from high-throughput sequencing data. MicroRNA arm switching across different tissues is a well-known example of what we define as differential processing. Here, short RNAs from the nine cell lines of the ENCODE project, irrespective of their annotation status, were analyzed for genomic loci representing differential or coherent processing. We observed differential processing predominantly in RNAs annotated as miRNA, snoRNA or tRNA. Four out of five known cases of differentially processed miRNAs that were in the input dataset were recovered and several novel cases were discovered. In contrast to differential processing, coherent processing is widespread in both annotated and unannotated regions. While the annotated loci predominantly consist of ~24nt short RNAs, the unannotated loci, by comparison, consist of ~17nt short RNAs. Furthermore, these ~17nt short RNAs are significantly enriched for overlap with transcription start sites and DNase I hypersensitive sites (p-value < 0.01), which are characteristic features of transcription initiation RNAs. We discuss how the computational pipeline developed in this study has the potential to be applied to other forms of RNA-seq data for further transcriptome-wide studies of differential and coherent processing.

    Emerging applications of read profiles towards the functional annotation of the genome

    Functional annotation of the genome in various species is important to understand their phenotypic complexity. The road towards functional annotation involves several challenges ranging from experiments on individual molecules to large-scale analysis of high-throughput sequencing (HTS) data. HTS data is typically the result of a protocol designed to address specific research questions. Sequencing produces reads which, when mapped to a reference genome, often form distinct patterns (read profiles). Interpretation of these read profiles is essential for the analysis in relation to the research question addressed. Several strategies have been employed at varying levels of abstraction, ranging from a somewhat ad hoc to a more systematic analysis of read profiles. These include methods that compare read profiles, e.g. from direct (non-sequence based) alignments to classification of patterns into functional groups. In this review, we highlight the emerging applications of read profiles for the annotation of non-coding RNA and cis-regulatory regions such as enhancers and promoters. We also discuss the biological rationale behind their formation.

    Peak-valley-peak pattern of histone modifications delineates active regulatory elements and their directionality

    Formation of a nucleosome-free region (NFR) accompanied by specific histone modifications at flanking nucleosomes is an important prerequisite for enhancer and promoter activity. Due to this process, active regulatory elements often exhibit a distinct shape of histone signal in the form of a peak-valley-peak (PVP) pattern. However, different features of PVP patterns and their robustness in predicting active regulatory elements have never been systematically analyzed. Here, we present PARE, a novel computational method that systematically analyzes the H3K4me1 or H3K4me3 PVP patterns to predict NFRs. We show that NFRs predicted by H3K4me1 and H3K4me3 patterns are associated with active enhancers and promoters, respectively. Furthermore, asymmetry in the height of peaks flanking the central valley can predict the directionality of stable transcription at promoters. Using PARE on ChIP-seq histone modifications from four ENCODE cell lines and four hematopoietic differentiation stages, we identified several enhancers whose regulatory activity is stage specific and correlates positively with the expression of proximal genes in a particular stage. In conclusion, our results demonstrate that PVP patterns delineate both the histone modification landscape and the transcriptional activities governed by active enhancers and promoters, and therefore can be used for their prediction. PARE is freely available at http://servers.binf.ku.dk/pare

    Ancestrally duplicated conserved noncoding element suggests dual regulatory roles of HOTAIR in cis and trans

    HOTAIR was proposed to regulate either HoxD cluster genes in trans or HoxC cluster genes in cis, a mechanism that remains unclear. We have identified a 32-nucleotide conserved noncoding element (CNE) as an ancient HOTAIR sequence that likely originated at the root of vertebrates. The second round of whole-genome duplication resulted in one copy of the CNE within HOTAIR and another copy embedded in a noncoding transcript of HOXD11. Paralogous CNEs underwent compensatory mutations, exhibit sequence complementarity with respect to transcript directionality, and have high affinity in vitro. The HOTAIR CNE resembled a poised enhancer in stem cells and an active enhancer in HOTAIR-expressing cells. HOTAIR expression is positively correlated with HOXC11 in cis and negatively correlated with HOXD11 in trans. We propose a dual modality of HOTAIR regulation where transcription of HOTAIR and its embedded enhancer regulates HOXC11 in cis and sequence complementarity between paralogous CNEs suggests HOXD11 regulation in trans.

    The splicing factor RBM25 controls MYC activity in acute myeloid leukemia

    Splicing factors are often mutated in hematological malignancies. Here, the authors perform an in vivo shRNA screen in a CEBPA mutant AML mouse model and identify that RBM25 controls the splicing of pre-mRNAs encoding BCL-X and BIN1 to exert its tumour suppressor activities in AML

    Enhancer and Transcription Factor Dynamics during Myeloid Differentiation Reveal an Early Differentiation Block in Cebpa null Progenitors

    Transcription factors PU.1 and CEBPA are required for the proper coordination of enhancer activity during granulocytic-monocytic (GM) lineage differentiation to form myeloid cells. However, precisely how these factors control the chronology of enhancer establishment during differentiation is not known. Through integrated analyses of enhancer dynamics, transcription factor binding, and proximal gene expression during successive stages of murine GM-lineage differentiation, we unravel the distinct kinetics by which PU.1 and CEBPA coordinate GM enhancer activity. We find no evidence of a pioneering function of PU.1 during late GM-lineage differentiation. Instead, we delineate a set of enhancers that gain accessibility in a CEBPA-dependent manner, suggesting a pioneering function of CEBPA. Analyses of Cebpa null bone marrow demonstrate that CEBPA controls PU.1 levels and, unexpectedly, that the loss of CEBPA results in an early differentiation block. Taken together, our data provide insights into how PU.1 and CEBPA functionally interact to drive GM-lineage differentiation

    Structured RNAs and synteny regions in the pig genome

    BACKGROUND: Annotating mammalian genomes for noncoding RNAs (ncRNAs) is nontrivial since far from all ncRNAs are known and the computational models are resource demanding. Currently, the human genome holds the best mammalian ncRNA annotation, a result of numerous efforts by several groups. However, a more direct strategy is desired for the increasing number of sequenced mammalian genomes of which some, such as the pig, are relevant as disease models and production animals. RESULTS: We present a comprehensive annotation of structured RNAs in the pig genome. Combining sequence and structure similarity search as well as class specific methods, we obtained a conservative set with a total of 3,391 structured RNA loci of which 1,011 and 2,314, respectively, hold strong sequence and structure similarity to structured RNAs in existing databases. The RNA loci cover 139 cis-regulatory element loci, 58 lncRNA loci, 11 conflicts of annotation, and 3,183 ncRNA genes. The ncRNA genes comprise 359 miRNAs, 8 ribozymes, 185 rRNAs, 638 snoRNAs, 1,030 snRNAs, 810 tRNAs and 153 ncRNA genes not belonging to the aforementioned classes. When running the pipeline on a locally shuffled version of the genome, we obtained no matches at the highest confidence level. Additional analysis of RNA-seq data from a pooled library from 10 different pig tissues added another 165 miRNA loci, yielding an overall annotation of 3,556 structured RNA loci. This annotation represents our best effort at making an automated annotation. To further enhance the reliability, 571 of the 3,556 structured RNAs were manually curated by methods depending on the RNA class while 1,581 were declared as pseudogenes. We further created a multiple alignment of pig against 20 representative vertebrates, from which RNAz predicted 83,859 de novo RNA loci with conserved RNA structures. 528 of the RNAz predictions overlapped with the homology based annotation or novel miRNAs. We further present a substantial synteny analysis which includes 1,004 lineage specific de novo RNA loci and 4 ncRNA loci in the known annotation specific for Laurasiatheria (pig, cow, dolphin, horse, cat, dog, hedgehog). CONCLUSIONS: We have obtained one of the most comprehensive annotations for structured ncRNAs of a mammalian genome, which is likely to play central roles in both health modelling and production. The core annotation is available in Ensembl 70 and the complete annotation is available at http://rth.dk/resources/rnannotator/susscr102/version1.02. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-459) contains supplementary material, which is available to authorized users.

    Mutant CEBPA directly drives the expression of the targetable tumor-promoting factor CD73 in AML

    The key myeloid transcription factor (TF), CEBPA, is frequently mutated in acute myeloid leukemia (AML), but the direct molecular effects of this leukemic driver mutation remain elusive. To investigate mutant AML, we performed microscale, in vivo chromatin immunoprecipitation sequencing and identified a set of aberrantly activated enhancers, exclusively occupied by the leukemia-associated CEBPA-p30 isoform. Comparing gene expression changes in human mutant AML and the corresponding mouse model, we identified , encoding CD73, as a cross-species AML gene with an upstream leukemic enhancer physically and functionally linked to the gene. Increased expression of CD73, mediated by the CEBPA-p30 isoform, sustained leukemic growth via the CD73/A2AR axis. Notably, targeting of this pathway enhanced survival of AML-transplanted mice. Our data thus indicate a first-in-class link between a cancer driver mutation in a TF and a druggable, direct transcriptional target

    deepBlockAlign: a tool for aligning RNA-seq profiles of read block patterns

    Motivation: High-throughput sequencing methods allow whole transcriptomes to be sequenced fast and cost-effectively. Short RNA sequencing provides not only quantitative expression data but also an opportunity to identify novel coding and non-coding RNAs. Many long transcripts undergo post-transcriptional processing that generates short RNA sequence fragments. Mapped back to a reference genome, they form distinctive patterns that convey information on both the structure of the parent transcript and the modalities of its processing. The miR-miR* pattern from microRNA precursors is the best-known, but by no means singular, example